Constituency to Dependency Translation with Forests

نویسندگان

  • Haitao Mi
  • Qun Liu
چکیده

Tree-to-string systems (and their forestbased extensions) have gained steady popularity thanks to their simplicity and efficiency, but there is a major limitation: they are unable to guarantee the grammaticality of the output, which is explicitly modeled in string-to-tree systems via targetside syntax. We thus propose to combine the advantages of both, and present a novel constituency-to-dependency translation model, which uses constituency forests on the source side to direct the translation, and dependency trees on the target side (as a language model) to ensure grammaticality. Medium-scale experiments show an absolute and statistically significant improvement of +0.7 BLEU points over a state-of-the-art forest-based tree-to-string system even with fewer rules. This is also the first time that a treeto-tree model can surpass tree-to-string counterparts.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

تبدیل خودکار درخت‌بانک وابستگی فارسی به درخت‌بانک سازه‌ای

There are two major types of treebanks: dependency-based and constituency-based. Both of them have applications in natural language processing and computational linguistics. Several dependency treebanks have been developed for Persian. However, there is no available big size constituency treebank for this language. In this paper, we aim to propose an algorithm for automatic conversion of a depe...

متن کامل

Translation with Source Constituency and Dependency Trees

We present a novel translation model, which simultaneously exploits the constituency and dependency trees on the source side, to combine the advantages of two types of trees. We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. Our rules hold the property...

متن کامل

Dependency and Constituency in Translation Shift Analysis

Exploiting data from a parallel treebank recently developed for Italian, English and French, the paper discusses issues related to the development of a dependency-based alignment system. We focus on the alignment of linguistic expressions and constructions which are structurally different in the languages that have to be aligned, and on how to deal with them using dependency rather than constit...

متن کامل

A Look at Tesnière's Éléments through the Lens of Modern Syntactic Theory

A recent project to produce a much belated English translation of Lucien Tesnière’s Éléments de syntaxe structurale has provided the opportunity for an in depth look at Tesnière’s theory of syntax. This contribution examines a few aspects of Tesnière’s work through the lens of modern syntactic theory. Tesnière’s understandings of constituents and phrases, auxiliary verbs, prepositions, gapping,...

متن کامل

Statistical Dependency Parsing of Four Treebanks

Multilingual dependency parsing is gaining popularity in recent years for several reasons. Dependency structures are more adequate for languages with freer word order than the traditional constituency notion. There is a growing availability of dependency treebanks for new languages. Broad coverage statistical dependency parsers are available and easily portable to new languages. Dependency pars...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010